arXiv: 1504.0801 lv2 [cs.DS] 17 Dec 2015 


A Comparison of Approaches for Finding 
Minimum Identifying Codes on Graphs 


Victoria Horan * 

Air Force Research Laboratory 
Information Directorate 

Steve Adachi f 
Lockheed Martin 


Stanley Bak ^ 

Air Force Research Laboratory 
Information Directorate 

December 18, 2015 


D 


Abstract 

In order to formulate mathematical conjectures likely to be true, a 
number of base cases must be determined. However, many combinato¬ 
rial problems are NP-hard and the computational complexity makes this 
research approach difficult using a standard brute force approach on a 
typical computer. One sample problem explored is that of finding a mini¬ 
mum identifying code. To work around the computational issues, a variety 
of methods are explored and consist of a parallel computing approach us¬ 
ing Matlab, an adiabatic quantum optimization approach using a D-Wave 
quantum annealing processor, and lastly using satisfiability modulo theory 
(SMT) and corresponding SMT solvers. Each of these methods requires 
the problem to be formulated in a unique manner. In this paper, we ad¬ 
dress the challenges of computing solutions to this NP-hard problem with 
respect to each of these methods. 
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1 Problem Statement and Background 

First introduced in 1998 m, an identifying code for a graph C? is a subset of 
the vertices, S C V{G), such that for each v S V{G) the subset of vertices 
of S that are adjacent to v is non-empty and unique. That is, each vertex of 
the graph is uniquely identifiable by the non-empty subset of vertices of S to 
which it is adjacent. More formally, let N{v) be the set of vertices adjacent to 
V and B{v) = N{v) U {u}. Then we require that any two vertices u,v G V{G) 
have different identifying sets, or more precisely that we must have S H B{v) ^ 
SC\B(u), and also that both SC\B{v), SC\B{u) 0. The combinatorial problem 
of finding minimum identifying codes has been shown to be NP-complete [B] , but 
also has many potential real-world applications. The most commonly discussed 
application illustrates one use of identifying codes: placing sensors on a network. 
If we place sensors on the nodes corresponding to our identifying code, then the 
set of sensors alerted gives us location information on the trigger. For example, 
if we place smoke detectors in a house using an identifying code, then based 
off of which smoke detectors are set off we can pinpoint exactly which room of 
the house is on fire |22j . Other applications appearing in the literature refer to 
similar scenarios, such as fault diagnosis of multiprocessor systems m- 

As some NP-complete problems are notorious for having special graph classes 
on which there are simple solutions, previous research has focused on the class of 
de Bruijn networks [1]. This paper explores the problem of finding the minimum 
size of an identifying code over the undirected de Bruijn graph using three 
different methods. Section describes an approach for solving the miminum 
identifying code problem using adiabatic quantum optimization [2, which for 
small enough problem instances can be implemented on a D-Wave quantum 
annealing processor m- While to our knowledge this is the first time adiabatic 
quantum optimization has been applied to the identifying code problem, it has 
been studied for other graph-theoretic problems including graph coloring p5] 
and the graph isomorphism problem uni, ESI; in addition, m showed how to 
formulate a number of NP-complete problems as Ising models. Other approaches 
are considered as follows: Subsection |3.1| explores a parallel computing algorithm 
and Subsection |3.2| illustrates the method of using Satisfiability Modulo Theory 
(SMT) solvers. 

While many of our examples and data revolve around the class of de Bruijn 
graphs, the methods discussed throughout can easily be applied to arbitrary 
graphs. As this problem has not been considered before outside of m , no 
results exist on the minimum size of identifying codes on this class of graphs. In 
this paper, we provide initial data on these values. A summary of our complete 
contribution to these values is given in Figure 

For reference, we provide some of the basic definitions and background for 
the class of de Bruijn graphs here. The undirected d-ary de Bruijn graph of 
order n, denoted B{d,n), is the graph with the following vertex and edge sets. 
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Figure 1: The 2-ary de Bruijn graph of order 3, or B{2,3). 


V = {xiX2 ■ ■ ■Xn\ Xi € .. ,d-1}) 

E = {{x,y)\x2X3,...Xn=yiy2---yn-lOY XiX 2 ...Xn-l=y 2 y'i---yn} 

For example, the graph B{2, 3) is illustrated in FigureThese graphs have 
many useful properties for applications, such as having a relatively high number 
of nodes, a low degree at each node, and many short paths between any two 
nodes. Additionally, many notoriously difficult problems such as the traveling 
salesman problem are solvable in polynomial time on this class of graphs. For 
that reason, they have many interesting applications such as interconnection 
networks [20] and fault-tolerant wireless sensor networks [Hj. 

2 Quantum Annealing Approach 

This section describes our approach for solving the minimum identifying code 
problem using quantum annealing. Quantum annealing processors, such as 
those made by D-Wave Systems [l2| , operate on the principle of adiabatic quan¬ 
tum optimization (AQO) |5] . In AQO, the system Hamiltonian evolves according 
to the equation 


H{t) “ ® Hinit + S H final, (1) 

where Hinit is the initial Hamiltonian with a known and easily prepared ground 
state; Hfinal is the final Hamiltonian whose ground state corresponds to the 
solution of our optimization problem; and s(r) increases from s(0) = 0 to s(l) = 
1. According to the adiabatic theorem [T^, for a large enough T and a smooth 
enough function s determined by the minimum spectral gap, a system starting 
in the ground state of Hinit at time t = 0 will be in the ground state of Hfinal at 
time t = T. However, since a physical implementation of AQO does not strictly 
meet the conditions of the adiabatic theorem, in particular the assumption of 
a closed system, we refer to the process as quantum annealing^ which may be 
viewed as a heuristic method for optimization. 
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The D-Wave processor is designed to solve quadratic binary minimization 
problems that can be expressed in terms of an Ising spin glass energy functional. 
Namely, a ground state of Hfinal is a minimum of 

^^hjSj + JijSiSj ( 2 ) 

j 


where Sj = ±1 are the final qubit states in the computational basis, and the hj 
and Jij are programmable qubit biases and coupling strengths respectively. 

While there are various ways of expressing the minimum identifying code 
problem as a binary minimization problem, we found that the most efficient 
approach, as measured by the number of qubits needed to solve the problem 
on the D-Wave processor, was to formulate the problem in terms of Boolean 
satisfiability. This approach, described in Subsection 2.1 leads to a conjunctive 
normal form (CNF) proposition containing clauses of various sizes. 

Since the energy functional must be at most quadratic, we use “gadgets” 
(similar to those found in [T]) to reduce the higher order clauses and generate 
an Ising model whose ground state encodes the solution to the satisfiability 
problem. These gadgets add overhead in the form of ancillary binary variables 
that augment the problem variables; however we found that the overall qubit 
resource requirements for this approach are less than with other approaches 
we considered. Our approach for mapping SAT clauses to an Ising model is 
described in Subsection 12.21 

The physical limitations of the D-Wave architecture present some additional 
challenges: 


• Not every pair of qubits on the chip are physically connected; rather, the 
connectivity can be represented as a square lattice of 1 ^ 4,4 bipartite graphs, 
that D-Wave calls a Chimera graph. Thus if the binary variables in the 
optimization problem are mapped 1-to-I to qubits, not all of the quadratic 
coefficients Jy can be programmed. This challenge can be overcome using 
graph minor embedding [5] , in which case a single binary variable may be 
mapped to multiple physical qubits. 


• Furthermore, an actual D-Wave chip may contain a small number of faulty 
qubits (due to fabrication defects or calibration failures) which cannot be 
used in the graph minor embedding process m- While determining the 
optimal embedding for an arbitrary graph is itself an NP-hard problem, 
heuristic embedding techniques have been developed that provide reason¬ 
ably efficient mappings [5]. 

• Finally, the D-Wave processor has a small amount of intrinsic control error 
(ICE), meaning that the actual values of biases and coupling strengths may 
differ slightly from the values programmed by the user. Some of the effects 
contributing to the ICE can be mitigated using gauge transformations [3] , 
which are explained below. 
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Subsection 2.3 describes the embedding and gauge transformation techniques 
we used to map the problem onto the D-Wave architecture. 

The examples given in Subsections |2 .1 1 through |T4| are for the undirected de 
Bruijn graph n) with d = 2 and n = 4. This was the largest case we were 
able to solve on a D-Wave machine operated by Lockheed Martin and the Uni¬ 
versity of Southern California, which had 504 working qubits. In Subsection |2.5[ 
we also estimate the number of qubits that would be needed to solve the larger 
d = 2 cases with n = 5,6. Compared to other types of optimization problems 
that have been studied using quantum annealing and the D-Wave machine, it 
appears that the qubit resource requirements for the minimum identifying code 
problem on de Bruijn graphs scale relatively well. 

While the satisfiability-based approach proved to be the most efficient for 
this study, we also explored various approaches for expressing the minimum 
identifying code problem as a binary minimization problem. For sake of com¬ 
pleteness, Subsection |2.6| describes two other approaches that we considered. 


2.1 Satisfiability Formulation 

In this subsection, we describe how to formulate the minimum identifying code 
problem in terms of Boolean satisfiability. While the method is presented here 
for a de Bruijn graph, the same approach can be applied to any graph. 

We label the vertices of the de Bruijn graph B{d,n) in lexicographic order 
i = 0,1, 2, ...d" — 1. For a subset S of B{d, n), we define Boolean variables 

f I, if z e s; 

* [0, otherwise. 

We will construct a conjunctive normal form (CNF) formula that must be 
satisfied if S is an identifying code. Later, in the full optimization problem, we 
will add a penalty term of the form A Xi to obtain the minimum identifying 
code. 

For S to be an identifying code, it must intersect the above-defined ball B{i) 
for every i. In other words, for every i the clause 

Ci= y Xj 

must be satisfied for all i. Similarly, the condition S C B{i) ^ S C] B{j) for all 
i ^ j implies that 

must be satisfied for all i ^ j, where 0 is the exclusive OR operator. Thus an 
identifying code must satisfy the proposition 

i 
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For the de Bruijn graph , 8 ( 2 , 4 ), we can compute P explicitly and simplify 
the CNF to obtain 

P ={X2 V X3) A (xq V X2 V X4 V X5 V Xg V Xg) A (xg V X3 V Xq V X7 V Xg V Xg) 

A (xq V xi V a;2 V X4 V a;9 V xig) A (x4 V a;i2) A (xq V xi V xe V xg V X12 V X14) 
A (xo V X3 V X4 V X5 V Xs V Xg) A (xq V Xg V Xg V X7 V Xg V Xg) 

A (xo V X1 V X3 V X4 V Xg V Xio) A (xi V Xg V Xg V Xio) 

A (xi V X4 V Xg V Xio V Xii) A (xo V Xg V X5 V Xg V Xg V X12) 

A (xi V X3 V X5 V X12) A (xi V X2 V Xg V Xio V X13) 

A (xi V Xo V Xg V Xii V Xi4 V X15) A (xi V X3 V X7 V Xg V X12 V X14) 

A (xi V X3 V Xg V Xg V Xi4 V X15) A (X4 V X5 V Xg V Xg V Xii) 

A (xo V XI V X2 V Xg V Xio V X12) A (X3 V Xg V Xio V X12) 

A (X2 V X5 V Xg V Xg V X13) A (X2 V X4 V Xii V X13) 

A (x2 V Xo V X7 V Xio V X13) A (x2 V xg V xg V X13 V X14) 

A (x3 V X5 V X7 V X12) A (x3 V Xio V a^i2 V X14) 

A (x3 V X5 V Xg V Xi3 V Xi4 V X15) A (x3 V Xg V X7 V Xio V X13 V X15) 

A (x3 V xii) A (xo V xi V X4 V Xg V Xg V X14) A (x4 V Xg V X7 V xio V xn) 

A (x4 V Xg V Xg V Xii V X14) A (xg V X7 V xio V X14) 

A (xg V Xg V Xii V X12 V Xi4 V Xig) A (xg V Xg V Xn V X13 V X14 V Xig) 

A (xg V X7 V Xg V Xg V Xi3 V Xig) A (xg V X7 V Xg V Xg V X12 V Xig) 

A (xg V X7 V xio V xii V X12 V xig) A (xg V X7 V xio V xn V X13 V xig) 

A (xi2 V X13) A (xo V Xi V Xg) A (xi V Xg V X4 V Xg V Xg) 

A (xi V X3 V Xg V X7 V Xg) A (x2 V X4 V Xg V Xg V Xio) 

A (x2 V Xg V Xio V xii) A (x4 V Xg V xio V X13) 

A (xg V Xg V X7 V Xii V X13) A (xg V Xg V Xg V X12 V X14) 

A (xg V Xio V Xii V X13 V X14) A (x7 V X14 V Xig) 

Unfortunately, the above formula, consisting of 50 clauses over 16 variables, 
was slightly too large to map onto the architecture of the D-Wave processor 
we were using for this study. We can decompose the problem into smaller 
subproblems by observing that P contains the clauses X3 V Xn and X4 V Xig. 
Hence if P is satisfied, at least one of X3 and xn must be true, and similarly at 
least one of X4 and X12 must be true. So we consider the following 4 cases: 

1) X3 = X4 = 1 

2) X3 = X12 = 1 

3 ) xii = X4 = 1 

4 ) xii = X12 = 1 

While these 4 cases are not disjoint, if we find the minimum identifying codes 
for each case and take the union of the solutions, then the minimum length codes 
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over the union will be the solutions of the original minimum identifying code 
problem. 

We will illustrate the procedure for case 1 . If we set X3 = a;4 = 1 in P, we 
obtain the reduced formula (24 clauses over 14 variables) 

P' =(xo V X1 V Xg V Xg V X12 V Xu) A (xq V X2 V Xg V Xy V Xs V Xg)A 
(xi V Xg V Xs V Xio) A (xq V X2 V Xg V Xs V Xg V Xi2)A 
(xi V X2 V Xg V Xio V X13) A (a;i V Xg V Xg V Xii V 0:14 V a:i5)A 

(xq V Xi V X2 V Xg V Xio V X12) A (x 2 V Xg V Xs V Xg V Xi3)A 

(x 2 V Xg V Xy V Xyg V X13) A (xg V Xg V Xg V 0:13 V a;i4)A 
(xg V Xy V Xio V 2:14) A (xg V Xg V Xu V a:i2 V a;i4 V a:i5)A 
(xg V XgV Xu V a;i3 V a;i4 V Xig) A (xg V Xy V Xs V Xg V a:i3 V a;i5)A 

(a;6 V X7 V Xs V Xg V X12 V X15) A (xg V Xy V Xio V Xii V X12 V Xi5)A 

(xg V Xy V Xio V Xu V X13 V X15) A (xi2 V X13) A (xo V XI V X8)A 
(x2 V Xg V Xio V Xii) A (xs V Xg V Xy V Xu V Xi3)A 

(xg V Xs V Xg V X12 V X14) A (xg V Xig V Xu V X13 V X14) A (xy V X14 V X15) 

This formula is small enough to map onto the D-Wave processor, as will be 
described in the following sections. 

2.2 Mapping SAT Clauses to Ising Models 

The reduced proposition P' for an identifying code has the CNF form 

P' = /\\/ X, 

j ieAj 

where each Aj is a subset of the vertices of the de Bruijn graph. 

To map this to an Ising model, we define “spin” variables S'* = ±1 for each 


We will also define as needed, some “ancillary” variables Zk = ± 1 . 

We construct an Ising Hamiltonian of the form 

n{{s,},zk) = J2H, + xY,s^. 

3 i 

The Hamiltonian is a function of the problem variables {S'i} and ancillary vari¬ 
ables {zk}- Each of the terms Pj will be a function of the {Si : i € Aj}, and 
possibly some of the ancillary variables {zk}, with the following properties: 

• Hj is at most quadratic in {S'i : i G Hj}, and {zk} 

• {S'i} is a minimum of Hj iff \l Xj = 1. 
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We will show momentarily how the Hj are constructed. The last term A > 0 
is a penalty term that rewards smaller size codes. Therefore, the minimum 
solutions (or ground states) of % are the minimum identifying codes. 

To illustrate how the Tij are constructed, consider the 3-OR clause cci V a ;2 V 
x^. If we define 


T~l-3{Si, S2, S3, zi) — S1S2 — 2 SiZi — 2S2Z1 — 2S2Z1 + Z1S3 + Si + S2 — 3 zi — S3 

then it can be easily checked that "Ha attains its minimum value iff at least 
one of the Si = +1, which corresponds to the clause xi V X 2 V X 3 being satisfied. 
Note that "Ha contains no higher than quadratic terms. 

We refer to the mapping from the 3-OR clause to Hs as a “gadget”. The 
gadget can be represented diagrammatically as in Figure which also shows 
gadgets for 4-OR through 6-OR clauses, which are the gadgets we need to map 
our satisfiability formula P' for B{d, n) to an Ising model. In the diagrams, num¬ 
bers attached to a node represent the linear coefhcients in the Ising model, while 
numbers attached to an edge represent the quadratic (coupling) coefficients in 
the Ising model. 

This technique is similar to the ones described in [T], except that those 
gadgets were designed for use with O/I variables instead of ±1 variables. Note 
that the choice of gadget coefficients is far from unique, and it may be possible to 
tune these coefficients, for example to accommodate the limited control precision 
of the quantum processor. However, the coefficients shown in Figure were 
sufficient for the problem at hand. When larger D-Wave processors become 
available that enable solving the minimum identifying code problem for larger 
{d,n), tuning of the gadget coefficients may be needed, along with the gauge 
transformation techniques described in the next subsection. 

Using the gadgets shown in Figure]^ we mapped the satisfiability formula 
P' to an Ising model with 49 ancillary variables {zfc}, for a total of 63 variables. 
We furthermore added the penalty term A Si so that the ground state will 
be a minimum identifying code. Since this Ising model could in principle be 
implemented on an ideal quantum annealing machine with 63 qubits, we say 
that the model has 63 logical qubits, and we refer to this model as the logical 
Ising model. 

While the overhead incurred in the satisfiability-based approach to obtain a 
quadratic Ising model is substantial (going from 14 boolean variables in P' to 
63 logical qubits in the Ising model), one advantage of this approach is that the 
connectivity of the resulting Ising model (i.e. pairs of variables with nonzero 
coefficients) is relatively sparse. This can be seen from Figure]^ In the figure, 
nodes corresponding to the original 14 boolean variables are shown in green; the 
remaining nodes represent the ancillary variables added during the SAT-to-Ising 
mapping process. Edges represent pairs of variables with nonzero coefficients 
{Jij ^ 0). The relative sparsity of this graph will facilitate mapping the problem 
onto the D-Wave architecture, as described in the next subsection. 


2-OR 


3-OR 



5-OR 

+1 +1 +1 +1 



6-OR 

+1 +1 +1 +1 



Figure 2: Mapping from OR-clauses to Ising Models 
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Figure 3: Graph of logical Ising model for the SAT formula P'. Nodes corre¬ 
sponding to the original 14 boolean variables are shown in green; the remaining 
nodes represent the ancillary variables added during the SAT-to-Ising mapping 
process. 

2.3 Mapping the Logical Ising Model onto the D-Wave 
Processor 

This subsection describes additional pre-processing steps that take place before 
the logical Ising model is solved on the D-Wave processor. Graph minor em¬ 
bedding enables the logical Ising model to be mapped onto the limited qubit 
connectivity of the D-Wave architecture, by utilizing potentially multiple phys¬ 
ical qubits per logical qubit. To improve solution accuracy, gauge transforma¬ 
tions are used to partially mitigate the intrinsic control errors that occur when 
programming the D-Wave hardware. 

2.3.1 Graph Minor Embedding 

The physical connectivity between qubits on the D-Wave chip can be represented 
as a square lattice of K 4 4 bipartite graphs, that D-Wave calls a Chimera graph. 
Even for Ising models with relatively sparse graphs, it may not be possible to 
directly map the Ising model onto the Chimera graph. 

For example, even the simple Ising model for 3-OR (the "Ha gadget) shown 
in Figure cannot be directly mapped onto the Chimera graph. An easy way 
to see this is to observe that the graph of FI 3 contains a 3-cycle, whereas the 
smallest cycle possible on the Chimera graph is a 4-cycle. However, the graph of 
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Figure 4: Embedding 3-OR onto the Chimera Graph 


"Ha can be mapped onto the Chimera graph using minor embedding. One such 
embedding is shown in Figure]^ In the figure, the logical qubit zi is mapped 
to two physical qubits, which are ferromagnetically coupled with a coupling 
strength -JpM- The h and J values shown in the figure dehne an embedded 
Hamiltonian which only uses connections that are available in the Chimera 
graph. It is easily verified that the ground state of the embedded Hamiltonian 
corresponds to the ground state of the original Hamiltonian 

Choi [S] showed that any graph can be minor embedded into a sufficiently 
large Chimera graph. Heuristic embedding algorithms have been developed [5] 
to generate minor embeddings for Chimera graphs with faulty qubits. The D- 
Wave software includes a heuristic embedding tool that can be used to find 
embeddings using the working qubits on an actual D-Wave chip. 

Using the D-Wave heuristic embedding tool, we mapped the graph of the 
logical Ising model, that was shown in Figure into the hardware graph of our 
D-Wave processor which had 504 working qubits. Figurej^shows the embedding 
that we used, consisting of 253 physical qubits, with a maximum chain length 
(number of physical qubits per logical qubit) of 8. The coefficients of the physical 
Ising model are determined as in [7]. 


2.3.2 Gauge Transformations 

In the context of quantum annealing, a gauge transformation [3] is a transfor¬ 
mation of the Ising spin variables 

Si = GA 


where Gi € {—1,4-1}. The gauge transformation induces a transformation on 
the physical Ising model coefficients 


h' = GA 

Jij — GiGjJij 


11 





Figure 5: Embedding the problem onto the D-Wave hardware: physical qubits 
corresponding to the same logical qubit have the same color and are labeled 
with the same number; the unlabeled red qubits are known faulty qubits and 
are not used. 
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1 2 3 4 9 10 11 12 




Figure 6: Example Gauge Transformation 


so that the Hamiltonian is invariant. 

While the gauge transformation generates a problem that is mathematically 
equivalent to the original problem, in practice it has been found that gauge 
transformations can mitigate some of the intrinsic control error (ICE) of the D- 
Wave hardware [14] and that the choice of the gauge G can affect the probability 
of hnding optimal solutions [H]. 

Consider for example the gauge transformation G shown in Eigure The 
figure depicts the action of G on the first unit cell, where the red qubits are 
flipped by G while the blue qubits are unchanged; i.e. all of the horizontal 
qubits are flipped while the vertical qubits are unchanged. In the next unit cell, 
all of the vertical qubits are flipped while the horizontal qubits are unchanged; 
and so on in an alternating pattern. 

We have found that this gauge transformation G is particularly helpful in 
connection with embedding. As mentioned earlier, the embedding process maps 
a single logical qubit to multiple physical qubits that are “chained” together 
with a ferromagnetic coupling -Jfm- Furthermore, to enforce the embedding 
constraints (i.e. that all the physical qubits in a chain should take the same 
value), the coupling strength Jfm is generally made to be the dominant coupling 
in the Hamiltonian. The gauge transformation G has the property that all the 
ferromagnetic couplings -Jfm in each chain are replaced by antiferromagnetic 
couplings J-Jfm- This can help to mitigate certain types of systematic ICE 
errors that are magnified by long ferromagnetically coupled chains. 

2.4 Solution for 8(2,4) Using D-Wave Processor 

Using the reduced proposition P', mapped to a logical Ising model using the 
gadgets shown in Figure and the embedding shown in Figure we used 
the 504-qubit LM/USC D-Wave processor to solve the resulting physical Ising 
model using quantum annealing. 
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Since the minimum identifying code problem for 3(2,4) is small enough to 
solve by brute force, we know that the reduced proposition P' should be satisfied 

by 

Xs = Xio = Xi3, = Xu = l 

(and the rest of the variables zero). Combined with the assumption X 3 = 0:4 = 1, 
this correspond to a minimum code size of 6 . 

For the non-gauge transformed problem, we found that the D-Wave machine 
did not obtain the optimal solution. In 480 experiments totaling 630,000 anneal¬ 
ing runs, over a range of different settings for the annealing time, we obtained 
zero occurrences of the optimal solution; the best that we found were solutions 
corresponding to a code size of 7. 

On the other hand, using the gauge transformation G shown in Figure in 
240 experiments totaling 825,000 annealing runs (again using a range of settings 
for the annealing time), we obtained the solution 


Xg ^10 ^13 ^14 1 

corresponding to the minimum code size of 6 in 25 experiments. So, while the 
ground state probability was still very low in the gauge transformed problem, 
there was a noticeable difference using the gauge transformation. 

2.5 Scaling for Larger Cases 

While the ^6(2,4) case was the largest that could be solved on the 504-qubit 
LM/USC D-Wave processor, we can make some rough estimates of how the 
qubit resource requirements scale for larger cases. For binary (d = 2) de Bruijn 
graphs, we extended the techniques described above to generate Ising models 
for the minimum identifying code problem for the cases n = 3 through n = 6 . 
These models were then embedded into ideal Chimera graphs of various sizes 
using the D-Wave heuristic embedding tool. Figure shows how the number of 
qubits needed for the embedding grows as a function of n. 

Since the embedding algorithm is heuristic, it is possible that smaller em¬ 
beddings could be found, so the embedding sizes shown here should be viewed 
as upper bounds. We computed embeddings firstly for larger versions of the 
current Chimera architecture, which has 8 -qubit unit cells, as well as for 
hypothetical Kg g (16-qubit unit cell) Chimera graphs. For the K 4 4 Chimera 
graphs, we started with the current Vesuvius architecture, which contains 64 
8 -qubit unit cells arranged in an 8 x 8 square grid, and increased the grid size 
to 12x12, 16x16, 24x24, 32x32, and so on until the embedding was successful. 
For the Kg^g graphs, we started with an 8 x 8 grid of 16-qubit unit cells, and 
increased the grid size to 16x16,24x24, and 32x32, until the embedding was 
successful. The plots show that for any given case, the embedding size on the 
it's ,8 Chimera is smaller than on the 7^4 4 Chimera due to the higher graph 
connectivity of the Kg^g Chimera architecture; in other words, qubit resource 
requirements depend on the hardware graph connectivity. 
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Figure 7: Estimated scaling of qubit resource requirements for minimum iden¬ 
tifying code problem on binary de Bruijn graphs B{2,n) 
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From the figure, we can project roughly when the D-Wave processor would 
have sufficient qubits to accommodate the larger cases of the minimum iden¬ 
tifying code problem for undirected binary de Bruijn graphs. If we define a 
processor generation to be 4 times the number of qubits as the previous gener¬ 
ation (e.g. the “Vesuvius” generation had a 512-qubit design whereas the prior 
“Rainier” generation had a 128-qubit design), we can state that in roughly 1.5 
generations we would be able to fit the B{2, 5) case on the processor. This is 
the largest d = 2 case for which the minimum identifying code size is presently 
known. In one more generation beyond that, we would be able to fit the B(2,6) 
case on the processor, for which the minimum identifying code size is presently 
not known. 

Whether these hypothetical D-Wave processors would actually be able to 
solve these larger cases, will depend on the performance characteristics of those 
machines which is yet to be demonstrated. The hardware intrinsic control errors 
would need to be significantly reduced from the current levels. For example, the 
embedding we found for the B{2, 6) case on the 1^4,4 Chimera architecture had 
a maximum chain length of 63 qubits. Solving a problem with chain lengths 
this large would probably require greater control precision (the current Vesu¬ 
vius design has 4 bits of precision m)- On the RTg.s Chimera architecture, the 
embeddings are less complex; e.g. the embedding we found for the B(2,6) case 
only had a maximum chain length of 25 qubits. However, it is not clear how 
difficult it would be for D-Wave to achieve a 16-qubit unit cell design. 

Due to risk factors such as these, it could take longer than the projected 
number of processor generations before a solution can be found to the B{2,6) 
case. On the other hand, it may be possible to break the problem down into 
subproblems, as illustrated earlier in Subsection |2.1[ which may make it easier 
to fit the problem onto the D-Wave processor. 

2.6 Other Problem Formulations Considered 

While the satisfiability formulation provided the best result, we include two 
other methods here for the sake of completeness. It should be noted that while 
these methods were not practical for our sample problem, they may provide 
advantages for other problems. 

2.6.1 Integer Programming Model 

From |24j . we have the following integer program formulation of the minimum 
identifying code problem in graphs. 

First, we define the modified adjacency matrix as follows. It is the adjacency 
matrix plus the identity matrix. 



1, ii {i,j) e E 01 i = j; 
0, otherwise. 


Using this definition, we see that a ball of radius 1 centered at vertex i is 
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given by the following vector. 


B{i) — [An, A2i ,..., Ani[^ 


Our vertex subset S is defined as the following vector. 


5” = [si, S 2 , ■ • •, SnY' where s* 


1, if * G S] 

0 , otherwise. 


To compare two identifying sets with respect to S for vertices i and j, the 
following expression computes the size of {B{i) n S)A{B{j) n S). 


n 

^ ^ \Aki I ‘ 

k=l 


This implies that in order for S' to be a valid identifying code, we must have 
the following inequality satisfied for all pairs of vertices i and j. 

n 

^ ^ \ A}ii I * ^ 1 

fc=l 


For the dominating property to be satisfied, we require the following addi¬ 
tional inequality. 

y4-S> 1^ 

Thus our integer program is given by the following, 
min |S| 

S-t. YJk=l\^ki - Akj\ ■ Sk > 1 , 

A-S >1^ 

Sfe G {0,1} 


In order to use these ideas for the D-Wave machine, our constraints must be 
equalities. This means we must add binary slack variables for each inequality. 
For the first set of inequalities, we must determine an upper bound for each 
inequality. Since these correspond to the constraint |(i3(i)nS')A(i3(j)n5')| > 1, 
an easy upper bound is given by the following. 


\B{i)\ + \BU)\ > \{B{i) n S)A{BU) n 5)1 > 1 


For the class of de Bruijn graphs, we are able to use this to get a bound on 
the number of slack variables needed. Since the maximum size of any ball in 
B{d,n) is 2d + 1, this gives us an upper bound of size Ad + 2 for this class of 
graphs. Hence for each inequality in this set, we must add 4d -|- 2 binary slack 
variables to convert the inequality to an equality. Using a variable reduction 
method from we can further reduce this number of variables to log(4fi-|-2). 
Since there are possible pairs i,j, this implies that we must add a 
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huge number of binary slack variables, equal to the following expression, just to 
satisfy the first set of inequalities. 

d"(d"-l)log(4d + 2) , , 

- slack variables 

2 

Even in the case of ,8(2,4), this means that we will need to add 320 variables 
to our list - an enormous number when compared to the graph size of 16 nodes. 
Hence, this method is not going to be an efficient way to map our problem onto 
the D-Wave machine. 

2.6.2 Binary Optimization Model 

We present a binary optimization formula for the minimum identifying code 
problem. Adjustments must be made to create a quadratic version (QUBO), 
making this approach impractical for large scale results. We will define this 
model using three separate functions: one to show that the set has the correct 
size, one to show that the set is dominating, and one to show that the set is 
separating (or identifying). 

Variable Definitions 

We will use the notation B{v) for v G V{G), where B{v) = N{v) U {w}. In 
other words, B{v) is the set containing all vertices adjacent to v, plus v itself. 
This is referred to in graph theory as the ball of radius one centered at v. 

We define the variables as follows. 

^ J 1, if z G B{v)-, 

( 0, otherwise. 

Set S has size k 

We define the first function, Ha, as follows. 

B^A — (^ ^vv') 

= 0 iff 181 = k. 

Set 8 is a dominating set 

By definition, this is equal to Vu G G, B{v) n 8 0. This is equivalent to 

the following. 
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From this statement, we get the following equation for our second function. 

Hb = ~ Xyy) ■ (nii«G£;(l ~ ^uv)) 

Set S' is a separating set 

By definition, this is equal to Vx,?/ S G, {B{x) n S)A{B{y) n S) yf 0. This 
is equivalent to the following for a specific pair x ^ y. 


(B(x) n s)A(B(j/) n s) ^ 0 


O 3v &{B{x)nS)A{B{y)nS) 

O 3z;, (v G B{x) n S) 0 (r G B{y) n S) 

GG dX, {XxV 1) 0 {Xyy 1) 

GG 3X, (1 i^XxV 0 Xyt;) 0) 

GG (1 ‘^xv — 0 

V 


From this statement, we get the following equation for our third function, 
summed over all pairs x,y. 


^yv) 

The Binary Optimization Model 

From these three functions, our binary optimization model is the following. 

H{S) = Ha{S) + Hb{S) + HciS) 

= 0 iff S is an identifying code. 

Note that while this does provide a binary optimization model for our prob¬ 
lem, it is not quadratic. In order to convert H (S) to a quadratic binary equation, 
each higher order term must be replaced with several new variables. While this 
is possible, it is a time-consuming and arduous process that introduces many 
new variables. Hence this approach is not the most efficient implementation. 


3 Other Approaches to Solving the Problem 

3.1 Parallel Computing 

The traditional brute force approach to solving the minimum identifying code 
problem constructs all possible subsets from smallest to largest size, and checks 
whether or not the set is a valid identifying code. Parallelizing our algorithm 
(implemented in Matlab using the Parallel Computing Toolbox) requires moving 
the construction of subsets inside the parallelized loop. Because of the expo¬ 
nential increase in the number of subsets created, it is more efficient to generate 
each subset within the loop and discard it after the iteration than to store all 
) fc-subsets and traverse through the list. This is done using a fc-subset un¬ 
ranking algorithm. Two of these algorithms (from [TB]) are listed as Algorithms 
and These unranking functions allow us to completely parallelize the brute 
force algorithm, and the results obtained are listed in Figure 
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Algorithm 1 Revolving Door Unranking Algorithm 


1 

2 

3 

4 

5 

6 

7 

8 
9 

10 

11 


procedure REvDoOR(r,/c, n) 
X = n 

for i = k \ do 

while (®) > r do 


a; = a; — 1 

end while 

U = a; + 1 



end for 

end procedure 
return T = • ,4) 


> subset index, subset size, set size 



Algorithm 2 Lexicographic Unranking Algorithm 

1: procedure LExUNRANK(r, fc, n) > subset index, subset size, set size 

2: a: = 1 

3: for i = 1 ■. k do 

4: while r > do 

5 ^ {'[Zd 

6: a; = a; + 1 

7: end while 

8 : ti = X 

9: X = a; + 1 

10: end for 

11: end procedure 
12 : return T = (ti, ^ 2 , • ■ •, tk) 



(a) Minimum Size (b) Runtime (sec) 

Figure 8 : Results for B{d,n) obtained using parallel computing and the corre¬ 
sponding runtime. 


20 






3.2 SMT Solvers 


Satisfiability Modulo Theory (SMT) is a current area of research that is con¬ 
cerned with the satisfiability of formulas with respect to some background the¬ 
ory m- SMT solvers combine boolean SAT solving with decision procedures for 
specific theories. For example, consider the following problem. 

0 = 6-1-1, c < a, c > b 

In the theory of the integers, this problem is not satisfiable (there are no integers 
a, b, c where all the expressions are true), however in the theory of the real 
numbers it is satisfiable (for example, with a=ll, b=10, c=10.5). In general, 
solving an SMT problem consists of first solving a SAT problem, then doing 
theory-specific reasoning, and then possibly going back and changing the SAT 
problem. This process is repeated if necessary. In addition, multiple theories can 
also be used in the same SMT problem instance, which may require additional 
repeats of this method. 

To use SMT solvers on our identifying code problem for the undirected de 
Bruijn graph, we must first come up with a formulation of the problem using 
decision procedures. The graph B{d,n), contains d” nodes. For each of these, 
we create a boolean variable that denotes whether or not the node is part of 
the identifying code. We then also create an array of boolean variables for that 
node’s identifying set. An assertion is added to make sure that each element of 
the array is true if and only if the corresponding neighbor’s boolean variable is 
true (i.e. if and only if the neighbor is part of the identifying code). To ensure 
unique codes, we add a statement to require that each node’s identifying set is 
unique from every other node’s identifying set. Then, to get codes of a fixed 
size, we create an integer variable for each node and add the constraints that the 
integer is at least 0 and no greater than 1. Next we add an assertion that each 
node’s integer variable is 1 if and only if its boolean variable is true. Finally, 
we add a constraint that the sum of all of the integer variables is equal to the 
desired identifying code size. 

Now that the formulation of the problem has been determined, we can use 
a commercial SMT solver to find solutions. For this work, we used the solver 
Z3, made by Microsoft Research. We begin by first picking a code size, and 
asking if there exists an identifying code of that size. If not, then the code size 
is increased by 1 and the problem is posed to Z3 again. This continues until 
an identifying code of a specific size is found. To find all satisfying models, 
after a single model was found an assertion is inserted into the formulation 
that requires that the the previously found identifying code be eliminated as 
an option. This forces Z3 to produce a different solution, or to state that the 
formulation is unsatisfiable (and hence no more identifying codes of that size 
exist). This process is repeated in a loop to obtain all identifying codes. 

Using this approach on a single core, we were able to reproduce our results 
for B{d, n) from the HPC method in much less time. See Figure|^for a summary 
of these results. The numbers in parentheses denote that we found a code of 
that size, but did not eliminate the possibility of a smaller code existing. The 
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8 

(10) 











(a) Minimum Size (b) Runtime (sec) 

Figure 9: Minimum identifying codes on B{d, n) and the corresponding runtime 
for SMT solvers. 


times given are determined by the time required to find one solution of minimum 
size in addition to the time required to determine that no smaller code exists. 

Because of the advancements in current SAT and SMT solvers, they offer the 
potential to scale much better than a parallelized brute force approach. This 
is due in part to the fact that many of today’s solvers are capable of realizing 
which subsets of assignments will define an unsatisfiable result, and hence they 
will avoid models in which those statements are set. In our problem, this might 
correspond to a case in which nodes A and B have the same identifying set. 
In this case, the solver would not bother looking at combinations of True/False 
assignments on the other nodes that do not affect the identifying sets of A or 
B. 

In addition to the sophistication of today’s solvers, there is also the possi¬ 
bility of parallelizing the search. While some instances were run manually in 
a parallel manner for this experiment, there is some research to be done on 
automatically parallelizing the search in order to further our known minimum 
results. 


4 Conclusions 

The various methods discussed provide pure mathematicians with a range of 
opportunities for collaboration with scientists from several disciplines, such as 
computer science and physics. The methods explored in this study included 
a parallel computing approach using Matlab, an adiabatic quantum optimiza¬ 
tion approach using a D-Wave quantum annealing processor, and lastly using 
satisfiability modulo theory (SMT) and corresponding SMT solvers. From the 
base cases that we constructed using our variety of approaches, several new 
conjectures have been developed and eventually proven true [1], however the 
leg work needed to compute the base cases required a deep understanding of 
various computing techniques. 
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